Hierarchical Load Balancing for Parallel Fast Legendre Transforms

Authors

  • Nadia Shalaby
  • S. Lennart Johnsson
Abstract

We present a parallel Fast Legendre Transform (FLT) based on the Driscoll-Healy algorithm, with computational complexity O(N log^2 N). The parallel FLT is load-balanced in a hierarchical fashion. We use a load-balanced FFT to derive a load-balanced parallel fast cosine transform, which in turn serves as a building block for the Legendre transform engine, from which the parallel FLT is constructed. We demonstrate how the arithmetic, memory, and communication complexities of the parallel FLT are derived hierarchically from the complexities of its modular blocks.
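The first step of the hierarchy described above, obtaining a cosine transform from an FFT, can be illustrated with a small serial sketch. This is not the authors' load-balanced parallel algorithm: it assumes the unnormalized DCT-II convention, a power-of-two length, and the well-known even/odd reordering trick that needs only one length-N FFT. The function names `fft`, `dct2_via_fft`, and `dct2_direct` are hypothetical and chosen for illustration only.

```python
import cmath
import math


def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2])
    odd = fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dct2_via_fft(x):
    """Unnormalized DCT-II, X[k] = sum_j x[j] cos(pi (2j+1) k / (2N)),
    computed with a single length-N FFT (assumed convention)."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [float(x[0])]
    # Reorder: even-indexed samples first, odd-indexed samples reversed.
    v = [0.0] * n
    for m in range(n // 2):
        v[m] = x[2 * m]
        v[n - 1 - m] = x[2 * m + 1]
    spectrum = fft(v)
    # A quarter-sample phase shift turns the FFT output into cosine sums.
    return [
        (cmath.exp(-1j * cmath.pi * k / (2 * n)) * spectrum[k]).real
        for k in range(n)
    ]


def dct2_direct(x):
    """O(N^2) reference implementation of the same DCT-II definition."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n))
        for k in range(n)
    ]
```

Comparing `dct2_via_fft` against `dct2_direct` on a short input is a quick way to check the reordering and phase factor. In a parallel setting, the FFT call is the part whose data distribution would determine the load balance of the cosine transform built on top of it.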

Similar resources

Computational aspects of a code to study rotating turbulent convection in spherical shells

The coupling of highly turbulent convection with rotation within a full spherical shell geometry, such as in the solar convection zone, can be studied with the new anelastic spherical harmonic (ASH) code developed to exploit massively parallel architectures. Inter-processor transposes are used to ensure data locality in spectral transforms, a sophisticated load balancing algorithm is implemente...

Hierarchical Parallelization of MLFMA for the Efficient Solution of Large-Scale Electromagnetics Problems

We present the details of a hierarchical partitioning strategy for the efficient parallelization of the multilevel fast multipole algorithm (MLFMA) on distributed-memory architectures. Unlike previous parallelization approaches, this strategy is based on the simultaneous distribution of clusters and their fields by considering the optimal partitioning of each level separately. Using the hierarch...

A Hierarchical Parallel Processing System for the Multipass-Rendering Method

The multipass-rendering method integrating radiosity with ray-tracing gives one of the best solutions for synthesizing photo-realistic images. However, the method is also computationally expensive. Therefore, parallel processing is the most promising approach to the fast multipass-rendering method. This paper presents a hierarchical parallel processing system for the multipass-rendering method....

Hierarchical Partitioning and Dynamic Load Balancing for Scientific Computation

Cluster and grid computing has made hierarchical and heterogeneous computing systems increasingly common as target environments for large-scale scientific computation. A cluster may consist of a network of multiprocessors. A grid computation may involve communication across slow interfaces. Modern supercomputers are often large clusters with hierarchical network structures. For maximum efficien...

A Prototypical Self-Optimizing Package for Parallel Implementation of Fast Signal Transforms

This paper presents a self-adapting parallel package for computing the Walsh-Hadamard transform (WHT), a prototypical fast signal transform, similar to the fast Fourier transform. Using a search over a space of mathematical formulas representing different algorithms to compute the WHT, the package finds the best parallel implementation on a given shared-memory multiprocessor. The search automat...

Publication date: 1997